19 research outputs found

    How wearing headgear affects measured head-related transfer functions

    The spatial representation of sound sources is an essential element of virtual acoustic environments (VAEs). When determining the sound incidence direction, the human auditory system evaluates monaural and binaural cues, which are caused by the shape of the pinna and the head. While spectral information is the most important cue for the elevation of a sound source, we use differences between the signals reaching the left and the right ear for lateral localization. These binaural differences manifest in interaural time differences (ITDs) and interaural level differences (ILDs). In many headphone-based VAEs, head-related transfer functions (HRTFs) are used to describe the sound incidence from a source to the left and right ear, thus integrating both the monaural and the binaural cues. Specific aspects, for example the individual shape of the head and the outer ears (e.g. Bomhardt, 2017), of the torso (Brinkmann et al., 2015), and probably even of headgear (Wersenyi, 2005; Wersenyi, 2017), influence the HRTFs and thus probably localization and other perceptual attributes as well.
    Generally speaking, spatial cues are modified by headgear, for example by wearing a baseball cap, a bicycle helmet, or a head-mounted display, which nowadays is often used in VR applications. In many real-life situations, however, good localization performance is important when wearing such items, e.g. in order to detect approaching vehicles when cycling. Furthermore, when performing psychoacoustic experiments in mixed-reality applications using head-mounted displays, the influence of the head-mounted display on the HRTFs must be considered. Effects of an HTC Vive head-mounted display on localization performance have already been shown in Ahrens et al. (2018). To analyze the influence of headgear for varying directions of incidence, measurements of HRTFs on a dense spherical sampling grid are required. However, HRTF measurements of a dummy head with various headgear are still rare, and to our knowledge only one dataset, measured for an HTC Vive on a sparse grid with 64 positions, is freely accessible (Ahrens, 2018).
    This work presents high-density measurement data of HRTFs from a Neumann KU100 and a HEAD acoustics HMS II.3 dummy head, each equipped with either a bicycle helmet, a baseball cap, an Oculus Rift head-mounted display, or a set of extra-aural AKG K1000 headphones. For the measurements, we used the VariSphear measurement system (Bernschütz, 2010), which allows precise positioning of the dummy head at the spatial sampling positions. The various HRTF sets were captured on a full spherical Lebedev grid with 2702 points.
    In our study, we analyze the measured datasets in terms of their spectrum, their binaural cues, and their localization performance based on localization models, and compare the results to reference measurements of the dummy heads without headgear. The results show that differences to the reference without headgear vary significantly depending on the type of headgear. Regarding the ITDs and ILDs, the analysis reveals the strongest influence for the AKG K1000. While for the Oculus Rift head-mounted display the ITDs and ILDs are mainly affected for frontal directions, only a very weak influence of the bicycle helmet and the baseball cap on ITDs and ILDs was observed. For the spectral differences to the reference, the results show maximal deviations for the AKG K1000 and the lowest for the Oculus Rift and the baseball cap. Furthermore, we analyzed for which incidence directions the spectrum is influenced most by the headgear. For the Oculus Rift and the baseball cap, the strongest deviations were found for contralateral sound incidence. For the bicycle helmet, the most affected directions are also contralateral, but shifted upwards in elevation. Finally, the AKG K1000 headphones generally have the highest influence on the measured HRTFs, which becomes maximal for sound incidence from behind.
    The results of this study are relevant for applications where headgear is worn and localization or other aspects of spatial hearing are considered. This could be the case, for example, in mixed-reality applications where natural sound sources are presented while the listener is wearing a head-mounted display, or when investigating localization performance in certain situations, e.g. in sports activities where headgear is used. However, it is an important intention of this study to provide a freely available database of HRTF sets which is well suited for auralization purposes and which allows further investigation of the influence of headgear on auditory perception. The HRTF sets will be publicly available in the SOFA format under a Creative Commons CC BY-SA 4.0 license.
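
The two binaural cues named in this abstract can be illustrated with a minimal sketch. It assumes a pair of equal-length head-related impulse responses given as plain lists of samples, estimates a broadband ILD as an energy ratio in dB, and estimates a broadband ITD from the lag of the cross-correlation maximum. The function names are hypothetical, and this is not necessarily the estimator used in the study.

```python
import math


def ild_db(left, right):
    """Broadband ILD in dB: energy of the left signal relative to the right."""
    e_left = sum(x * x for x in left)
    e_right = sum(x * x for x in right)
    return 10.0 * math.log10(e_left / e_right)


def itd_seconds(left, right, fs):
    """Broadband ITD: lag (in seconds) that maximizes the cross-correlation.

    A positive value means the left signal arrives later than the right.
    Assumes len(left) == len(right).
    """
    n = len(left)
    best_lag, best_val = 0, -math.inf
    for lag in range(-(n - 1), n):
        # c(lag) = sum_k left[k + lag] * right[k]
        val = 0.0
        for k in range(n):
            j = k + lag
            if 0 <= j < n:
                val += left[j] * right[k]
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag / fs


# Toy example: the left ear receives a delayed, attenuated copy of the right ear signal.
fs = 48000
right = [0.0] * 64
left = [0.0] * 64
right[10] = 1.0
left[16] = 0.5
itd = itd_seconds(left, right, fs)  # 6 samples of delay
ild = ild_db(left, right)           # about -6 dB
```

In practice, ITDs are often estimated on low-pass filtered or onset-detected impulse responses, and ILDs per frequency band; the broadband versions here only illustrate the definitions.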

    Magnitude-Corrected and Time-Aligned Interpolation of Head-Related Transfer Functions

    Head-related transfer functions (HRTFs) are essential for virtual acoustic realities, as they contain all cues for localizing sound sources in three-dimensional space. Acoustic measurements are one way to obtain high-quality HRTFs. To reduce measurement time, cost, and complexity of measurement systems, a promising approach is to capture only a few HRTFs on a sparse sampling grid and then upsample them to a dense HRTF set by interpolation. However, HRTF interpolation is challenging because small changes in source position can result in significant changes in the HRTF phase and magnitude response. Previous studies greatly improved the interpolation by time-aligning the HRTFs in preprocessing, but magnitude interpolation errors, especially in contralateral regions, remain a problem. Building upon the time-alignment approaches, we propose an additional post-interpolation magnitude correction derived from a frequency-smoothed HRTF representation. Employing all 96 individual simulated HRTF sets of the HUTUBS database, we show that the magnitude correction significantly reduces interpolation errors compared to state-of-the-art interpolation methods applying only time alignment. Our analysis shows that when upsampling very sparse HRTF sets, the subject-averaged magnitude error in the critical higher frequency range is up to 1.5 dB lower when averaged over all directions and even up to 4 dB lower in the contralateral region. As a result, the interaural level differences in the upsampled HRTFs are considerably improved. The proposed algorithm thus has the potential to further reduce the minimum number of HRTFs required for perceptually transparent interpolation
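
The idea of combining time alignment with a magnitude correction can be sketched for the simplest case of two neighboring HRTFs. This is a strongly simplified illustration, not the authors' algorithm: it assumes the per-direction delays `tau1` and `tau2` are already known, uses plain linear weights instead of a spherical interpolation scheme, and takes the linearly interpolated magnitude as the correction target without the frequency smoothing mentioned in the abstract. All names are hypothetical.

```python
import cmath
import math


def interpolate_hrtf(h1, h2, tau1, tau2, freqs, w):
    """Interpolate two complex HRTF spectra with weight w in [0, 1].

    w = 0 returns h1, w = 1 returns h2. tau1 and tau2 are the (assumed known)
    onset delays in seconds; freqs holds the bin frequencies in Hz.
    """
    tau = (1 - w) * tau1 + w * tau2  # interpolated delay
    out = []
    for a, b, f in zip(h1, h2, freqs):
        omega = 2 * math.pi * f
        # Time alignment: remove each HRTF's delay before mixing.
        a0 = a * cmath.exp(1j * omega * tau1)
        b0 = b * cmath.exp(1j * omega * tau2)
        mix = (1 - w) * a0 + w * b0
        # Magnitude correction: rescale to the interpolated magnitude.
        target = (1 - w) * abs(a) + w * abs(b)
        if abs(mix) > 0:
            mix *= target / abs(mix)
        # Re-insert the interpolated delay.
        out.append(mix * cmath.exp(-1j * omega * tau))
    return out


# Two pure delays of 0.2 ms and 0.6 ms, interpolated halfway.
freqs = [0.0, 1250.0, 2500.0, 5000.0]
tau1, tau2 = 0.0002, 0.0006
h1 = [cmath.exp(-2j * math.pi * f * tau1) for f in freqs]
h2 = [cmath.exp(-2j * math.pi * f * tau2) for f in freqs]
mid = interpolate_hrtf(h1, h2, tau1, tau2, freqs, 0.5)
```

For comparison, naively averaging `h1` and `h2` without alignment cancels completely at 1250 Hz, because the two delays differ by half a period there. The aligned result keeps unit magnitude and a delay of 0.4 ms.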

    Analysis and visualization of dynamic human voice directivity

    In many everyday situations, we experience the influence of the human voice directivity. We perceive loudness and timbre differently when a speaker faces us or turns away from us. Often, we use voice directivity intuitively, for example when facing a person in a meeting or a casual conversation. Such effects of human voice directivity have long been a topic of research. Early studies were carried out more than 200 years ago analyzing the directional radiation of speech in general

    How positioning inaccuracies influence the spatial upsampling of sparse head-related transfer function sets

    Determining full-spherical individual sets of head-related transfer functions (HRTFs) based on sparse measurements is a prerequisite for various applications in virtual acoustics. To obtain dense sets from sparse measurements, spatial upsampling of sparse HRTF sets in the spatially continuous spherical harmonics (SH) domain can be performed by an inverse SH transform. However, this involves artifacts caused by spatial aliasing and order truncation. In a previous publication we presented the SUpDEq method (Spatial Upsampling by Directional Equalization), which reduces these artifacts by a directional equalization prior to the SH transform. Apart from the spatial resolution of the HRTF set, measurement inaccuracies, for example caused by displacements of the head during the measurement, can influence the spatial upsampling as well. Such inaccuracies add direction-dependent temporal and spectral deviations to the dataset, which in the process of spatial upsampling can cause artifacts comparable to spatial aliasing errors. To reduce the influence of the distance inaccuracies, we present a method for distance error compensation that performs an appropriate distance-shifting of the measured HRTFs. Determining the required values for the shift benefits from the directional equalization performed by SUpDEq and results in time-aligning the directionally equalized HRTFs. We analyze the influence of the angular and distance displacements on the spectrum, on interaural cues, and on modeled localization performance. While limited angular inaccuracies have only a low impact, even small random distance displacements cause strong impairments, which can be significantly reduced by applying the proposed distance error compensation method
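
To give a feel for why small distance errors matter, the following sketch shifts a measured HRTF spectrum from the actual to the intended measurement distance. It assumes a free-field point source (propagation delay r/c and 1/r amplitude decay) and a speed of sound of 343 m/s; the paper itself derives the shift from the directionally equalized HRTFs, so this is only an approximation of the general idea, with hypothetical names.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value


def compensate_distance(spectrum, freqs, r_measured, r_target):
    """Shift a complex HRTF spectrum from r_measured to r_target (meters).

    Free-field point-source assumption: undo the extra propagation delay
    (r_measured - r_target) / c and the extra 1/r attenuation.
    """
    delta_t = (r_measured - r_target) / SPEED_OF_SOUND
    gain = r_measured / r_target
    return [
        gain * h * cmath.exp(2j * math.pi * f * delta_t)
        for h, f in zip(spectrum, freqs)
    ]


# A source intended at 1.50 m but actually measured at 1.52 m.
freqs = [0.0, 500.0, 1000.0, 5000.0]
r_target, r_measured = 1.50, 1.52
measured = [
    (r_target / r_measured) * cmath.exp(-2j * math.pi * f * r_measured / SPEED_OF_SOUND)
    for f in freqs
]
fixed = compensate_distance(measured, freqs, r_measured, r_target)
```

A 2 cm displacement corresponds to roughly 58 microseconds, or almost three samples at 48 kHz, which already introduces a noticeable direction-dependent phase error at high frequencies.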

    Binaural reproduction of dummy head and spherical microphone array data—A perceptual study on the minimum required spatial resolution

    Dynamic binaural synthesis requires binaural room impulse responses (BRIRs) for each head orientation of the listener. Such BRIRs can either be measured with a dummy head or calculated from spherical microphone array (SMA) data. Because dense dummy head measurements require enormous effort, sparse measurements can alternatively be performed and then interpolated in the spherical harmonics domain. Real-world SMAs, on the other hand, have a limited number of microphones, resulting in spatial undersampling artifacts. For both methods, the spatial order N of the underlying sampling grid influences the reproduction quality. This paper presents two listening experiments to determine the minimum spatial order for the direct sound, early reflections, and reverberation of the dummy head or SMA measurements required to generate horizontally head-tracked binaural synthesis that is perceptually indistinguishable from a high-resolution reference. The results indicate that for direct sound, N = 9–13 is required for the dummy head BRIRs, but significantly higher orders of N = 17–20 are required for the SMA BRIRs. Furthermore, significantly lower orders are required for the late parts, with N = 4–5 for the early reflections and reverberation of the dummy head BRIRs but N = 12–13 for the early reflections and N = 6–9 for the reverberation of the SMA BRIRs
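
As a rough guide to what these orders imply in practice: a spherical harmonics expansion truncated at order N has (N + 1)^2 coefficients, so a sampling grid needs at least that many directions (or microphones) to resolve it. The short sketch below tabulates this lower bound for the orders quoted in the abstract; the helper name and dictionary labels are hypothetical, and real grids typically need somewhat more points than the bound.

```python
def sh_coefficient_count(order):
    """Number of spherical harmonics coefficients up to and including `order`."""
    return (order + 1) ** 2


# Order ranges quoted in the abstract (lower and upper end of each range).
reported_orders = {
    "dummy head, direct sound": (9, 13),
    "SMA, direct sound": (17, 20),
    "dummy head, early reflections and reverberation": (4, 5),
    "SMA, early reflections": (12, 13),
    "SMA, reverberation": (6, 9),
}

minimum_points = {
    label: (sh_coefficient_count(low), sh_coefficient_count(high))
    for label, (low, high) in reported_orders.items()
}

for label, (low, high) in minimum_points.items():
    print(f"{label}: at least {low} to {high} sampling points")
```

For example, the N = 17–20 found for the direct sound of SMA BRIRs corresponds to at least 324 to 441 sampling points, far more than the microphone count of typical real-world arrays.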

    Efficient binaural rendering of spherical microphone array data by linear filtering

    High-quality rendering of spatial sound fields in real-time is becoming increasingly important with the steadily growing interest in virtual and augmented reality technologies. Typically, a spherical microphone array (SMA) is used to capture a spatial sound field. The captured sound field can be reproduced over headphones in real-time using binaural rendering, virtually placing a single listener in the sound field. Common methods for binaural rendering first spatially encode the sound field by transforming it to the spherical harmonics domain and then decode the sound field binaurally by combining it with head-related transfer functions (HRTFs). However, these rendering methods are computationally demanding, especially for high-order SMAs, and require implementing quite sophisticated real-time signal processing. This paper presents a computationally more efficient method for real-time binaural rendering of SMA signals by linear filtering. The proposed method allows representing any common rendering chain as a set of precomputed finite impulse response filters, which are then applied to the SMA signals in real-time using fast convolution to produce the binaural signals. Results of the technical evaluation show that the presented approach is equivalent to conventional rendering methods while being computationally less demanding and easier to implement using any real-time convolution system. However, the lower computational complexity goes along with lower flexibility. On the one hand, encoding and decoding are no longer decoupled, and on the other hand, sound field transformations in the SH domain can no longer be performed. Consequently, in the proposed method, a filter set must be precomputed and stored for each possible head orientation of the listener, leading to higher memory requirements than the conventional methods. 
As such, the approach is particularly well suited for efficient real-time binaural rendering of SMA signals in a fixed setup where usually a limited range of head orientations is sufficient, such as live concert streaming or VR teleconferencing
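
The rendering structure described here, one precomputed FIR filter per microphone and ear with the filtered signals summed per ear, can be sketched as follows. For readability the sketch uses direct time-domain convolution on plain lists; a real-time system would instead use FFT-based (partitioned) fast convolution as the abstract states. The function names and the toy filter values are hypothetical, and the filters are assumed to be given for one fixed head orientation.

```python
def convolve(x, h):
    """Direct linear convolution of two sample lists."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def render_binaural(mic_signals, filters_left, filters_right):
    """Filter each microphone signal with its per-ear FIR filter and sum per ear.

    mic_signals[m] is the signal of microphone m; filters_left[m] and
    filters_right[m] are the precomputed filters for that microphone.
    All signals share one length, and all filters share one length.
    """
    n_out = len(mic_signals[0]) + len(filters_left[0]) - 1
    left = [0.0] * n_out
    right = [0.0] * n_out
    for x, h_l, h_r in zip(mic_signals, filters_left, filters_right):
        for n, v in enumerate(convolve(x, h_l)):
            left[n] += v
        for n, v in enumerate(convolve(x, h_r)):
            right[n] += v
    return left, right


# Toy example with two microphones and two-tap filters.
mic_signals = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
filters_left = [[1.0, 0.5], [0.25, 0.0]]
filters_right = [[0.0, 1.0], [1.0, 0.0]]
left, right = render_binaural(mic_signals, filters_left, filters_right)
```

Head tracking in this scheme means swapping in a different pair of filter sets per head orientation, which is where the higher memory requirement mentioned in the abstract comes from.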

    Towards the virtualization of a sound source localization acuity test to aid the diagnosis of spatial processing disorder in school-aged children: An experimental approach

    Spatial hearing is an essential auditory function. It allows us to localize, segregate, and group sound sources in space. Accurate sound source localization is a fundamental ability for understanding and following speech in everyday situations, as it contributes to our capacity to discern between target signal streams and other simultaneous sound sources that can be regarded as noise (cocktail party processing).
    BMBF, 13FH666IA6, IngenieurNachwuchs 2016: Binaurales Hören in der realen und virtuellen Welt zur Verbesserung der Hör-Erfahrung von Schulkindern (VIWER-S

    3-D Audio in Mobile Communication Devices: Methods for Mobile Head-Tracking

    Future generations of mobile communication devices will serve more and more as multimedia platforms capable of reproducing high-quality audio. To achieve 3-D sound perception, the reproduction quality of audio via headphones can be significantly increased by applying binaural technology. To be independent of individual head-related transfer functions (HRTFs) and to guarantee good performance for all listeners, an adaptation of the synthesized sound field to the listener's head movements is required. In this article, several methods of head-tracking for mobile communication devices are presented and compared. A system for testing the identified methods is set up, and experiments are performed to evaluate the pros and cons of each method. The implementation of such a device in a 3-D audio system is described, and applications making use of such a system are identified and discussed

    3-D Audio in Mobile Communication Devices: Effects of Self-Created and External Sounds on Presence in Auditory Virtual Environments

    This article describes a series of experiments which were carried out to measure the sense of presence in auditory virtual environments. Within the study, self-created signals are compared to signals created by the surrounding environment. Furthermore, it is investigated whether the room characteristics of the simulated environment affect the perception of presence during vocalization or when listening to speech. Finally, the experiments give information about the influence of background signals on the sense of presence. In the experiments, subjects rated the degree of perceived presence in an auditory virtual environment on a perceptual scale. It is described which parameters have the most influence on the perception of presence and which ones are of minor influence. The results show that, on the one hand, an external speaker has more influence on the sense of presence than an adequate presentation of one’s own voice. On the other hand, both room reflections and adequately presented background signals significantly increase the perceived presence in the virtual environment